Abstract

The flux/transmission power spectrum has become a popular statistical 
tool in studies of the high redshift (z > 2) Lyman-alpha forest. At low 
redshifts, where the forest has thinned out into a series of well-isolated 
absorption lines, the motivation for flux statistics is less obvious. Here, 
we study the relative merits of flux versus line correlations, and derive 
a simple condition under which one is favored over the other on purely 
statistical grounds. Systematic errors probably play an important role 
in this discussion, and they are outlined as well. 



1. Introduction 

Weinberg (this volume) has given a superb review of advances in our 
understanding of the high redshift Lyman-alpha forest and its connection 
to the cosmic web (Bond, Kofman & Pogosyan 1996). Much recent work 
has focused on the flux/transmission power spectrum, an approach 
pioneered by Croft et al. (1998) (see also Hui 1999). There are several 
different definitions in the literature. The one we adopt here is: 

$$ \xi_f(\Delta v) = \langle \delta_f(v) \, \delta_f(v + \Delta v) \rangle , \quad \delta_f \equiv (f - \bar{f})/\bar{f} , \quad P_f(k) = \int \xi_f(\Delta v) \, e^{-ik\Delta v} \, d\Delta v $$ (1)

where $\xi_f$ is the two-point flux correlation ($\Delta v$ specifies the lag in velocity), 
and $P_f$ is its Fourier transform, the flux power spectrum. Here $f$ 
is simply the transmission, $f = e^{-\tau}$, where $\tau$ is the Lyman-alpha optical 
depth. The symbol $\bar{f}$ denotes the mean transmission. Finally, $k$ is the 
wave-number in units of inverse velocity. 
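
As a rough illustration of the definition above, the following Python sketch estimates the flux power spectrum of an evenly sampled spectrum with a simple periodogram. The function name, the normalization $\delta_f = (f - \bar{f})/\bar{f}$, and the plain FFT are assumptions made for illustration. This is not the estimator whose errors are analyzed in Section 2, and noise subtraction is ignored.

```python
import numpy as np

def flux_power_spectrum(flux, dv):
    """Periodogram estimate of P_f(k) from evenly spaced transmission pixels.

    flux : 1-D array of transmission f = exp(-tau), one value per pixel
    dv   : pixel width in km/s
    Returns (k, power), with k in s/km and power in km/s.
    """
    flux = np.asarray(flux, dtype=float)
    n_pix = flux.size
    length = n_pix * dv                      # spectrum length L in km/s
    mean_flux = flux.mean()                  # estimate of the mean transmission
    delta = (flux - mean_flux) / mean_flux   # assumed normalization of delta_f
    delta_k = np.fft.rfft(delta) * dv        # discrete approximation to the Fourier integral
    power = np.abs(delta_k) ** 2 / length    # P(k) = |delta_k|^2 / L
    k = 2.0 * np.pi * np.fft.rfftfreq(n_pix, d=dv)
    return k[1:], power[1:]                  # drop the k = 0 mode

# Synthetic check: a single sinusoidal fluctuation of amplitude 0.1 (hypothetical data).
n_pix, dv = 1024, 2.0
v = np.arange(n_pix) * dv
k0 = 2.0 * np.pi * 16 / (n_pix * dv)
flux = 0.65 * (1.0 + 0.1 * np.cos(k0 * v))
k, power = flux_power_spectrum(flux, dv)
print(k[np.argmax(power)], power.max())
```

For a single mode of amplitude $A$ the periodogram peaks at $A^2 L/4$, which gives a quick sanity check on the normalization.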

The flux statistic above, which treats the transmission fluctuations 
on a pixel-by-pixel basis, is motivated by a physical picture in which the 
forest arises from continuous fluctuations in the intergalactic medium, 
rather than discrete, well-isolated clouds (Bi, Boerner & Chu 1992, Cen 
et al. 1994; for additional references, see Hui et al. 1997 and references therein). A 
second class of statistics, which has a longer history, treats the transmission 
fluctuations on a line-by-line basis. The counting of absorption lines 
in terms of their properties, such as the column density distribution, falls 
into this category. The analog of the flux two-point correlation, or power 
spectrum, is the line correlation or power spectrum, defined as: 

$$ \xi_n(\Delta v) = \langle \delta_n(v) \, \delta_n(v + \Delta v) \rangle , \quad \delta_n \equiv (n - \bar{n})/\bar{n} , \quad P_n(k) = \int \xi_n(\Delta v) \, e^{-ik\Delta v} \, d\Delta v $$ (2)

where $n(v)$ is the number density of lines, and $\bar{n}$ is its mean. Implicit 
in this definition is that one studies the correlation of absorption lines 
within some range of column density or equivalent width, or above some 
threshold. 

The respective motivations for line and flux statistics are probably 
both valid, depending on circumstances. For the low column density 
forest which probably arises from smooth fluctuations, flux statistics 
seems reasonable. For the higher column density systems, which likely 
arise from well-isolated galactic or pre-galactic halos, line statistics seems 
to provide a good characterization. The aim of this short note is to 
ask a purely statistical question, irrespective of the underlying 
physical picture: which kind of statistics can one measure with 
more precision? 

2. Statistical Error Analysis for the Flux vs. Line Power Spectrum 

The statistical error can be worked out for both the two-point correlation 
function and the power spectrum. The result is somewhat simpler 
to state in Fourier space, and so we will focus on the power spectrum. 
The Fourier space description has the additional advantage that the 
powers in separate wave-bands are uncorrelated, provided that the fluctuations 
are Gaussian random. The latter is a crucial assumption in our 
discussion below. The fluctuations in flux or number density of lines 
are almost certainly not exactly Gaussian random. However, because 
correlations seen in the forest are often quite weak, Gaussianity is not a 
bad approximation; at least, it provides us with a way to gauge the relative 
importance of shot-noise and the correlation signal, as we will see. By 
the central limit theorem, lower resolution data also tend to be more 
Gaussian random. 

The statistical dispersion in the measured flux power spectrum is given 
by: 

$$ (\delta P_f(k))^2 = \frac{1}{N_k} \left( P_f(k) + N_f^{-1} \right)^2 $$ (3)

where $N_k$ is the number of Fourier modes in the waveband of interest 
(which is centered at $k$), i.e. if the waveband has a width of $\Delta k$, 
$N_k = \Delta k/(2\pi/L)$, where $L$ is the length of the quasar absorption spectrum (if 
one has more than one line of sight, one adds the errors in quadrature in 
the usual way). 

The quantity $N_f$ (not to be confused with $N_k$) gives us a measure of 
the signal-to-noise of the data: the smaller $N_f$ is, the larger the 
shot-noise. To be precise, 

$$ N_f^{-1} = dv \, \frac{1}{N} \sum_{i=1}^{N} \frac{\mathrm{var}(i)}{[N_Q(i) \, \bar{f}]^2} $$ (4)

where $dv$ is the velocity width of each pixel, $N$ is the number of pixels, 
var($i$) is the variance of counts in pixel $i$, and $N_Q(i)$ is the mean quasar 
photon count in pixel $i$ (e.g. for a flat continuum, $N_Q$ would be independent 
of $i$). A useful approximation (accurate to within a factor of 
two or so) to the shot-noise $N_f^{-1}$ is given by $(dv/\bar{f})(N/S)^2$, where $\bar{f}$ is 
the mean transmission as before, and $N/S$ is the average noise-to-signal 
ratio at the level of the continuum. 
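
To get a feel for the size of the shot-noise term, the approximation $(dv/\bar{f})(N/S)^2$ can be evaluated directly. The short Python sketch below does only that. The function name and the example inputs (10 km/s pixels, mean transmission 0.65, continuum signal-to-noise of 30) are illustrative assumptions, not measured values.

```python
def shot_noise_approx(dv, mean_flux, noise_to_signal):
    """Approximate shot-noise N_f^{-1} in km/s: (dv / mean_flux) * (N/S)^2.

    dv              : pixel width in km/s
    mean_flux       : mean transmission f-bar
    noise_to_signal : average N/S at the level of the continuum
    """
    return dv / mean_flux * noise_to_signal ** 2

# Assumed example values, for illustration only.
example = shot_noise_approx(dv=10.0, mean_flux=0.65, noise_to_signal=1.0 / 30.0)
print(f"N_f^-1 ~ {example:.4f} km/s")
```

Since the text quotes the approximation as accurate only to within a factor of two or so, the number should be read as an order-of-magnitude guide.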

Eq. (3) is derived in Hui, Burles et al. (2001). Its intuitive meaning 
is quite apparent if one writes down the fractional error: 

$$ \frac{\delta P_f(k)}{P_f(k)} = \frac{1}{\sqrt{N_k}} \left( 1 + \frac{1}{N_f P_f(k)} \right) $$ (5)

One can see that 1. the longer the spectrum is, the larger the number of 
modes $N_k$, and therefore the smaller the fractional error; 2. the larger 
the intrinsic signal (i.e. $P_f(k)$), the smaller the fractional error; 3. the 
more noisy the spectrum is, the larger $N_f^{-1}$ is, and therefore the larger 
the error. 
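
The three trends listed above can be checked numerically. The Python sketch below assumes that eq. (3) reads $(\delta P_f)^2 = (P_f + N_f^{-1})^2/N_k$, so that the fractional error is $(1 + N_f^{-1}/P_f)/\sqrt{N_k}$. The function names and all input numbers are hypothetical placeholders chosen only to show the scaling.

```python
import math

def number_of_modes(band_width, length):
    """N_k = band_width / (2*pi / L), with band_width in s/km and L in km/s."""
    return band_width * length / (2.0 * math.pi)

def fractional_error(power, shot_noise, n_modes):
    """Fractional error (1 + shot_noise / power) / sqrt(N_k), under the Gaussian assumption."""
    return (1.0 + shot_noise / power) / math.sqrt(n_modes)

# Hypothetical inputs, for illustration only.
band_width = 0.005   # width of the waveband in s/km
length = 4.0e4       # length of the spectrum in km/s
power = 20.0         # placeholder for P_f(k) in km/s
shot_noise = 0.02    # placeholder for N_f^{-1} in km/s

base = fractional_error(power, shot_noise, number_of_modes(band_width, length))
longer = fractional_error(power, shot_noise, number_of_modes(band_width, 4.0 * length))
noisier = fractional_error(power, 100.0 * shot_noise, number_of_modes(band_width, length))
weaker = fractional_error(0.1 * power, shot_noise, number_of_modes(band_width, length))
print(base, longer, noisier, weaker)
```

Quadrupling the spectrum length halves the fractional error, while increasing the shot-noise or decreasing the signal raises it.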

How about the statistical error for the line power spectrum? 

The expression is very similar. The fractional error for the line power 
spectrum is: 

$$ \frac{\delta P_n(k)}{P_n(k)} = \frac{1}{\sqrt{N_k}} \left( 1 + \frac{1}{\bar{n} P_n(k)} \right) $$ (6)

where $P_n$ is the line power spectrum, $N_k$ is the same number of modes 
in the waveband centered at $k$, and $\bar{n}$ is the mean number density of lines. 
The intuitive meaning of this expression is also quite clear: the smaller 
the number density of lines, the larger the fractional error. The only 
difference between eqs. (5) and (6) is that $N_f^{-1}$ has been replaced by 
$\bar{n}^{-1}$. In other words, shot-noise from photon counts is replaced by 
shot-noise from the finite number of absorption lines. 

Before we draw conclusions from these two expressions, we should note 
that our results for the statistical error assume the quadratic estimator 
for the respective power spectrum is of a particular form (known in the 
large scale structure literature as (DD - 2DR + RR)/RR; Landy & 
Szalay 1993); other forms generally lead to larger errors. We refer the 
reader to the discussion in Hui et al. (2001) for details. The discussion 
there focused on the flux statistics, but very similar reasoning applies to 
line statistics as well. 

3. Discussion 

Eqs. (5) and (6) in the last section give the respective fractional errors in 
the flux power spectrum and the line power spectrum. From the two expressions, 
it is plain to see that the flux power spectrum can be measured with a 
higher statistical precision than the line power spectrum if 

$$ N_f > \bar{n} P_n / P_f $$ (7)

where $N_f \sim (S/N)^2 \bar{f}/dv$ (eq. [4]) is roughly the typical signal-to-noise 
squared per km/s of the quasar spectrum, $\bar{n}$ is the number density of 
lines, $P_n$ is the line power spectrum, and $P_f$ is the flux power spectrum. 
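
The step from the two fractional errors to this condition is short, and it may help to see it written out. The block below assumes that the fractional errors take the forms implied by eq. (3), with $N_f^{-1}$ replaced by $\bar{n}^{-1}$ for the lines, and that the same number of modes $N_k$ applies to both.

```latex
\begin{align*}
\frac{\delta P_f}{P_f} &= \frac{1}{\sqrt{N_k}} \left( 1 + \frac{1}{N_f P_f} \right), &
\frac{\delta P_n}{P_n} &= \frac{1}{\sqrt{N_k}} \left( 1 + \frac{1}{\bar{n} P_n} \right), \\
\frac{\delta P_f}{P_f} < \frac{\delta P_n}{P_n}
&\iff \frac{1}{N_f P_f} < \frac{1}{\bar{n} P_n}
\iff N_f > \frac{\bar{n} P_n}{P_f} .
\end{align*}
```

The last step uses only the fact that $N_f$, $\bar{n}$, $P_f$ and $P_n$ are all positive. The factor $1/\sqrt{N_k}$ cancels, so the condition does not depend on the length of the spectrum.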

At $z \sim 3$, all quantities on the right hand side have been measured, 
so we can derive the condition on the S/N above which the flux power 
spectrum can be measured with greater precision. The result depends 
of course on the scale of interest. Let us pick a typical scale of around 
$k \sim 0.01$ s/km (or velocity separation of about 300 km/s). At this scale, 
$P_n/P_f$ is about 100, depending on the column density of the absorption 
lines (for a lower column density cut of $\sim 10^{14}$ cm$^{-2}$; including more low 
column density lines would decrease this ratio) (see Cristiani et al. 1997 
and McDonald et al. 2000), while $\bar{n} \sim 2 \times 10^{-3}$ (km/s)$^{-1}$ (Kim et al. 
2002). Finally, $\bar{f} \sim 0.65$. Hence, the requirement for favoring the flux over 
the line power spectrum is: 

$$ (S/N)^2/dv > 0.3 \, (\mathrm{km/s})^{-1} $$ (8)

One can see that this is not a very stringent requirement on the signal-to-noise 
at all. For high quality Keck spectra, a signal-to-noise of several 
tens per resolution element ($dv \sim 10$ km/s) is quite typical, and so 
$(S/N)^2/dv \gg 0.3$ (km/s)$^{-1}$. For noisy, low-resolution spectra such as 
those obtained from the Sloan Digital Sky Survey, with $(S/N)^2/dv \sim 10^{-2}$ to $1$, 
it looks as though the line power spectrum might be favored, but one 
must keep in mind that for low-resolution data, both $\bar{n}$ and $P_n$ are much 
reduced, and the requirement on $(S/N)^2/dv$ can be relaxed by as much 
as a factor of 100. 
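
The arithmetic behind this requirement can be reproduced in a few lines. The Python sketch below takes the condition $N_f > \bar{n} P_n/P_f$ with $N_f \sim (S/N)^2 \bar{f}/dv$ and plugs in the $z \sim 3$ numbers quoted in the text. The function names are illustrative, and the choice of S/N = 30 to stand for "several tens" is an assumption.

```python
def flux_threshold(line_density, power_ratio, mean_flux):
    """Minimum (S/N)^2 / dv, in (km/s)^-1, for the flux power spectrum to be favored.

    Follows from N_f > n-bar * P_n / P_f with N_f ~ (S/N)^2 * mean_flux / dv.
    line_density : mean number density of lines n-bar, in (km/s)^-1
    power_ratio  : P_n / P_f at the scale of interest
    mean_flux    : mean transmission f-bar
    """
    return line_density * power_ratio / mean_flux

def snr_squared_per_kms(snr, dv):
    """(S/N)^2 / dv for a given signal-to-noise per pixel and pixel width in km/s."""
    return snr ** 2 / dv

# Values quoted in the text for z ~ 3 and k ~ 0.01 s/km.
threshold = flux_threshold(line_density=2e-3, power_ratio=100.0, mean_flux=0.65)

# "Several tens" per 10 km/s element, taken here as S/N = 30 (assumed).
keck_like = snr_squared_per_kms(snr=30.0, dv=10.0)

# Low-resolution case: requirement relaxed by up to a factor of 100, as stated in the text.
relaxed = threshold / 100.0

print(f"threshold: {threshold:.2f} (km/s)^-1")
print(f"Keck-like: {keck_like:.0f} (km/s)^-1")
print(f"relaxed:   {relaxed:.4f} (km/s)^-1")
```

With these inputs the threshold comes out near 0.3 (km/s)$^{-1}$, a Keck-like spectrum exceeds it by more than two orders of magnitude, and the fully relaxed threshold falls below the lower end of the $10^{-2}$ to $1$ range quoted for low-resolution survey spectra.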

The situation at lower redshifts, $z < 2$, is more uncertain. This is because 
no measurements have been made of the flux power spectrum at 
low redshifts, although much is known about the absorption-line number 
density and clustering (e.g. Weymann et al. 1998, Impey 1999, Penton 
et al. 2000, Dave & Tripp 2001, Chen et al. 2001, Bechtold et al. 2002). 
Both $\bar{n}$ and $P_f$ drop as one goes to lower redshifts, although $P_n$ tends 
to increase (this statement is cut-off dependent; we assume here a fixed 
column-density or equivalent-width threshold). One possibility is to assume 
that $\bar{n} P_n/P_f$ stays roughly constant, in which case eq. (8) remains 
a valid requirement on the signal-to-noise of the data. Instruments onboard 
HST frequently yield spectra that satisfy this requirement. It 
must be emphasized, however, that $P_f$ has yet to be measured at low redshifts, 
and, once it is measured, one must go back to the expression in eq. (7) 
to draw the appropriate conclusion. 

To end our discussion, it is important to underscore the fact that our 
discussion so far focuses entirely on the issue of statistical error. Systematic 
errors could make a significant difference to the conclusion one 
draws, as emphasized by several members of the audience. Two sources 
of systematic errors were brought up. One is that the efficiency of the 
spectrograph or detector might not be sufficiently well-characterized to 
allow an accurate flux correlation measurement. However, if the efficiency 
has small-scale fluctuations that are not well-understood, neither 
should one trust the absorption-line measurements. Second, spurious 
power introduced by the continuum might be more of an issue for the 
flux correlation than for the line correlation. This is certainly a potential 
worry. One should keep in mind, however, that continuum-fitting 
is in fact easier at low redshifts than at high redshifts, because of the 
thinning out of the forest (although continuum-fitting is actually not recommended 
as part of the data reduction; see Hui et al. 2001). The important 
question is: what is the scale below which the forest fluctuation 
dominates over the continuum fluctuation (recall that the continuum is 
smooth while the forest has lots of small scale structure)? At $z \sim 3$, 
this scale is about $k \sim 0.001$ s/km (or velocity separation of about a few 
thousand km/s). As one goes to lower redshifts, the forest flux power 
$P_f$ drops, and so this scale must move to a smaller value (or higher $k$). 
The issue is whether this scale is still sufficiently large to be interesting. 
At the very least, the author hopes that this short note will provide a 
stimulus to measure the flux power spectrum from low redshift quasar 
spectra. Measurements from actual data are certainly far more useful 
than speculations from a theorist. 

Thanks are due to the organizers of the IGM conference, especially 
Mary Putnam and Jessica Rosenberg, for gently and patiently urging the 
author to write up his talk, and to Todd Tripp for useful discussions. The 
interest expressed by Chris Impey in the issues discussed here has also 
provided an important motivation. This short paper covers the second 
half of the conference presentation. For the first half on the galaxy-IGM 
connection at z ~ 3, see Hui & Sheth (2002, in preparation); for related 
observational results, see Adelberger et al. (2002). Support for this work 
is provided by an Outstanding Junior Investigator Award from the DOE, 
an AST-0098437 grant from the NSF, and by the DOE at Fermilab, and 
NASA grant NAG5-10842. 